ABSTRACT

An algorithm is described which uses a product representation of the
matrix $H$, approximating the inverse Hessian matrix of the function to be
minimized. It is shown that the algorithm generates the same sequence of
points as the Broyden-Fletcher-Goldfarb-Shanno method. Using a simple
relation between the traces of the matrices $H_j$ and $H_{j+1}$ corresponding
to two consecutive points $x_j$ and $x_{j+1}$, the superlinear convergence of
the algorithm is established.

A PRODUCT VERSION OF A VARIABLE METRIC METHOD
AND ITS CONVERGENCE PROPERTIES

Klaus Ritter

1. Introduction.

Various kinds of variable metric methods have been used effectively
in the unconstrained minimization of a function of several variables. Their
main feature is the use of rank one or two corrections to the matrix $H$ which
represents an approximation to the inverse Hessian matrix of the function to
be minimized. Recently Brodlie, Gourlay and Greenstadt [1] and Davidon [3]
used a representation of $H$ as a product $CC'$. They showed that a rank
two correction to $H$ reduces to a rank one correction to $C$.

In this paper an algorithm is described which uses a simple rank one
update formula for the matrix $C$. It is shown that the algorithm gives a
product representation of the matrix $H$ used in the
Broyden-Fletcher-Goldfarb-Shanno method [2], [4], [5], [6]. Under appropriate
assumptions, the sequence generated by the algorithm converges superlinearly.

2. Formulation of the problem and notation.

Let $x \in E^n$ and let $F(x)$ be a real-valued function. We assume that
$F(x)$ is twice continuously differentiable and denote the gradient and the
Hessian matrix of $F(x)$ at a point $x_j$ by $g_j = \nabla F(x_j)$ and
$G_j = G(x_j)$, respectively. A prime is used for the transpose of a vector or
a matrix.

We consider the problem of determining a $z$ such that
$$ F(z) < F(x) \quad \text{for all } x \ne z . \tag{2.1} $$
It is assumed that there are positive numbers $\mu$, $\eta$ and $L$ such that
$$ \begin{aligned}
\mu\|x\|^2 \le x'G(y)x &\le \eta\|x\|^2 && \text{for all } x, y \in E^n, \\
\|G(x) - G(y)\| &\le L\|x - y\| && \text{for all } x, y \in E^n .
\end{aligned} \tag{2.2} $$

Assumption (2.2) implies that $F(x)$ is uniformly convex and that there is a
unique $z$ with property (2.1). It is determined by the condition
$\nabla F(z) = 0$. For later use we state two more well-known results which
are immediate consequences of (2.2):

$$ \|G(y)\| \le \eta, \qquad \|(G(y))^{-1}\| \le \mu^{-1} \quad \text{for all } y \in E^n . \tag{2.3} $$

There are numbers $\lambda_1 > 0$ and $\lambda_2 > 0$ such that for all $x \in E^n$
$$ \lambda_1\|\nabla F(x)\| \le \|x - z\| \le \lambda_2\|\nabla F(x)\| . \tag{2.4} $$
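
A short derivation of (2.4) may help here. The following is a sketch rather than the author's own argument; the symbol $\bar G$ is introduced only for this illustration. Since $\nabla F(z) = 0$, the integral form of the mean value theorem gives
$$ \nabla F(x) = \bar G\,(x - z), \qquad \bar G = \int_0^1 G(z + t(x - z))\,dt . $$
By (2.2) the symmetric matrix $\bar G$ satisfies $\mu\|w\|^2 \le w'\bar G w \le \eta\|w\|^2$ for all $w$, so that
$$ \mu\|x - z\| \le \|\nabla F(x)\| \le \eta\|x - z\| . $$
Under this reading, (2.4) holds with, for example, $\lambda_1 = \eta^{-1}$ and $\lambda_2 = \mu^{-1}$.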


3. The Algorithm.

In a variable metric method at a given point $x_j$, the search direction
$s_j$ is determined by multiplying the gradient $g_j = \nabla F(x_j)$ by an
appropriate matrix $H_j$, i.e.,
$$ s_j = H_j g_j , $$
where $H_j$ is an approximation to the inverse Hessian matrix of $F(x)$ at
$x_j$. With a suitable step size $\sigma_j$ a new point
$$ x_{j+1} = x_j - \sigma_j s_j $$
is computed. If $g_{j+1} = \nabla F(x_{j+1}) \ne 0$, the matrix $H_{j+1}$ is
determined from $H_j$ in such a way that
$$ H_{j+1}(g_j - g_{j+1}) = \sigma_j s_j . \tag{3.1} $$

The various variable metric methods differ in the updating procedure
which is used to compute $H_{j+1}$ from $H_j$ subject to (3.1). In many
methods $H_j$ is a symmetric positive definite matrix and $H_{j+1}$ is
obtained by adding two symmetric matrices of rank one to $H_j$.

Any symmetric positive definite matrix $H_j$ can be written in the form
$$ H_j = C_j C_j' , \tag{3.2} $$
where $C_j$ is a nonsingular matrix. Instead of adding a rank two correction
matrix to $H_j$ we can add a rank one matrix $uv'C_j$ to $C_j$ and represent
$H_{j+1}$ in the form
$$ H_{j+1} = (I + uv')\,C_j C_j'\,(I + vu') , \tag{3.3} $$
where the vectors $u$ and $v$ are determined in such a way that (3.1) is
satisfied and $H_{j+1}$ is nonsingular.

Brodlie, Gourlay and Greenstadt [1] and more recently Davidon [3]
have investigated representations of the form (3.2) and (3.3) of $H_j$ and
$H_{j+1}$, respectively, for some variable metric methods. In the following we
shall describe an algorithm which uses a very simple update procedure of the
form (3.3). It will be shown that this algorithm produces the same matrix
$H_{j+1}$ as the Broyden-Fletcher-Goldfarb-Shanno method; see [2], [4], [5], [6].

We describe a general cycle of the algorithm. At the beginning of the $j$th
cycle the following data is available: $C_j = (c_{1j}, \ldots, c_{nj})$, $x_j$,
$g_j$ and $\alpha_{ij} = c_{ij}'g_j$, $i = 1, \ldots, n$. For the initial cycle
any nonsingular $(n \times n)$-matrix can be used.

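Step 1: Computation of the direction of descent $s_j$.

Set
$$ s_j = \sum_{i=1}^{n} \alpha_{ij}\, c_{ij} . $$

Step 2: Computation of the step size $\sigma_j$.

Compute $\sigma_j$ such that
$$ F(x_j - \sigma_j s_j) = \min\{F(x_j - \sigma s_j) \mid \sigma \ge 0\} . $$
Set
$$ x_{j+1} = x_j - \sigma_j s_j $$
and compute $g_{j+1}$. If $g_{j+1} = 0$ stop, otherwise go to Step 3.

Step 3: Computation of $C_{j+1}$.

Compute
$$ \beta_{ij} = c_{ij}'g_{j+1}, \quad i = 1, \ldots, n, \qquad \gamma_j = s_j'g_j, \qquad \delta_j = s_j'g_{j+1}, $$
$$ \omega_{ij} = (\gamma_j - \delta_j)^{-1}\Bigl[\bigl(\sqrt{(1 - \delta_j\gamma_j^{-1})\,\sigma_j} - 1\bigr)\alpha_{ij} + \beta_{ij}\Bigr], $$
$$ c_{i,j+1} = c_{ij} + \omega_{ij}\, s_j, \qquad \alpha_{i,j+1} = \beta_{ij} + \omega_{ij}\,\delta_j, \qquad i = 1, \ldots, n . $$

Replace $j$ with $j+1$ and go to Step 1.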
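
The cycle above can be sketched in code. The following is a minimal illustration, not the author's implementation: it assumes a quadratic test function $F(x) = \tfrac12 x'Ax - b'x$ with symmetric positive definite $A$, for which the optimal step size of Step 2 has the closed form $\sigma_j = s_j'g_j / (s_j'As_j)$. The function name `product_vm_quadratic`, the tolerance `tol` and the limit `max_cycles` are hypothetical choices made for this sketch.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, x):
    return [dot(row, x) for row in M]

def product_vm_quadratic(A, b, x, C, tol=1e-10, max_cycles=100):
    """Sketch of Steps 1-3 for F(x) = 0.5*x'Ax - b'x (an assumed test problem).

    C is a nonsingular n x n matrix given as a list of rows; its columns are
    the vectors c_1, ..., c_n of the algorithm.
    """
    cols = [list(c) for c in zip(*C)]            # the vectors c_ij
    g = [r - bi for r, bi in zip(matvec(A, x), b)]
    alpha = [dot(c, g) for c in cols]            # alpha_ij = c_ij' g_j
    for _ in range(max_cycles):
        if math.sqrt(dot(g, g)) <= tol:
            break
        # Step 1: s_j = sum_i alpha_ij c_ij
        s = [sum(a * c[r] for a, c in zip(alpha, cols)) for r in range(len(x))]
        # Step 2: optimal step size (closed form, valid for a quadratic only)
        gamma = dot(s, g)
        sigma = gamma / dot(s, matvec(A, s))
        x = [xi - sigma * si for xi, si in zip(x, s)]
        g_next = [r - bi for r, bi in zip(matvec(A, x), b)]
        if math.sqrt(dot(g_next, g_next)) <= tol:
            break
        # Step 3: add a multiple of s_j to every vector c_ij
        beta = [dot(c, g_next) for c in cols]
        delta = dot(s, g_next)
        root = math.sqrt((1.0 - delta / gamma) * sigma)
        omega = [((root - 1.0) * a + bt) / (gamma - delta)
                 for a, bt in zip(alpha, beta)]
        cols = [[ci + w * si for ci, si in zip(c, s)]
                for c, w in zip(cols, omega)]
        alpha = [bt + w * delta for bt, w in zip(beta, omega)]
        g = g_next
    return x
```

Only the vectors $c_{ij}$ and the scalars $\alpha_{ij}$ are stored; the matrix $H_j$ is never formed. For a general $F$ the closed-form step size would have to be replaced by a line search.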


Remark:

In Step 1, the search direction $s_j$ is determined by
$$ s_j = C_j C_j' g_j . $$
In Step 2, we assume that $\sigma_j$ is the optimal step size. This assumption
is made in order to avoid some technical difficulties in the convergence
proofs. The algorithm also works with an approximation to the optimal step
size which after a finite number of steps can be set equal to 1. This will be
shown in a much more general context in a subsequent paper. In Step 3,
$C_{j+1}$ is obtained by simply adding a multiple of $s_j$ to each column
$c_{ij}$ of $C_j$. Obviously $\delta_j = 0$, if $\sigma_j$ is the optimal step
size.

In the following theorem it will be shown that the search directions
$s_j$ generated by the above algorithm are identical with the search
directions used in the Broyden-Fletcher-Goldfarb-Shanno method.

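Theorem 1.

Let $c_{1j}, \ldots, c_{nj}$, $\sigma_j$, $s_j$, $g_j$ and $g_{j+1}$ be defined
by the algorithm. Set
$$ C_j = (c_{1j}, \ldots, c_{nj}), \qquad H_j = C_j C_j', \qquad p_j = \frac{s_j}{\|s_j\|}, \qquad d_j = \frac{g_j - g_{j+1}}{\sigma_j\|s_j\|}, $$
and
$$ H_{j+1} = H_j + \frac{d_j'p_j + d_j'H_jd_j}{(d_j'p_j)^2}\, p_jp_j' - \frac{p_jd_j'H_j + H_jd_jp_j'}{d_j'p_j} . $$
Then
$$ s_{j+1} = C_{j+1}C_{j+1}'\,g_{j+1} $$
and
$$ H_{j+1} = C_{j+1}C_{j+1}' . $$

Proof:

The first assertion follows immediately from Steps 1 and 3 of the
algorithm since
$$ c_{i,j+1}'g_{j+1} = \beta_{ij} + \omega_{ij}\,\delta_j = \alpha_{i,j+1} . $$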

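In order to prove the second statement we observe that
$$ \begin{aligned}
c_{i,j+1} - c_{ij} &= \Bigl(\bigl(\sqrt{(1 - \delta_j\gamma_j^{-1})\sigma_j} - 1\bigr)c_{ij}'g_j + c_{ij}'g_{j+1}\Bigr)\frac{s_j}{s_j'(g_j - g_{j+1})} \\
&= \Bigl(\sqrt{(1 - \delta_j\gamma_j^{-1})\sigma_j}\;\frac{c_{ij}'g_j}{p_j'(g_j - g_{j+1})} - \frac{c_{ij}'(g_j - g_{j+1})}{p_j'(g_j - g_{j+1})}\Bigr)p_j .
\end{aligned} $$
Since
$$ (1 - \delta_j\gamma_j^{-1})\,\sigma_j = \frac{\sigma_j\,p_j'(g_j - g_{j+1})}{p_j'g_j}, \qquad p_j'(g_j - g_{j+1}) = \sigma_j\|s_j\|\,d_j'p_j, $$
it follows that
$$ C_{j+1} - C_j = p_j\Bigl(\frac{g_j'C_j}{\sqrt{p_j'g_j\;d_j'p_j\;\|s_j\|}} - \frac{d_j'C_j}{d_j'p_j}\Bigr) . $$
Therefore, using $H_jg_j = s_j = \|s_j\|\,p_j$ and $g_j'H_jg_j = \|s_j\|\,p_j'g_j$,
$$ \begin{aligned}
C_{j+1}C_{j+1}' &= C_jC_j' + (C_{j+1} - C_j)C_j' + C_j(C_{j+1} - C_j)' + (C_{j+1} - C_j)(C_{j+1} - C_j)' \\
&= H_j + p_j\Bigl(\frac{g_j'H_j}{\sqrt{p_j'g_j\,d_j'p_j\,\|s_j\|}} - \frac{d_j'H_j}{d_j'p_j}\Bigr) + \Bigl(\frac{H_jg_j}{\sqrt{p_j'g_j\,d_j'p_j\,\|s_j\|}} - \frac{H_jd_j}{d_j'p_j}\Bigr)p_j' \\
&\quad + p_j\Bigl(\frac{g_j'H_jg_j}{p_j'g_j\,d_j'p_j\,\|s_j\|} - \frac{2\,d_j'H_jg_j}{d_j'p_j\,\sqrt{p_j'g_j\,d_j'p_j\,\|s_j\|}} + \frac{d_j'H_jd_j}{(d_j'p_j)^2}\Bigr)p_j' \\
&= H_j + \frac{d_j'p_j + d_j'H_jd_j}{(d_j'p_j)^2}\,p_jp_j' - \frac{p_jd_j'H_j + H_jd_jp_j'}{d_j'p_j} \\
&= H_{j+1} .
\end{aligned} $$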
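
The identity $H_{j+1} = C_{j+1}C_{j+1}'$ of Theorem 1 can also be checked numerically. The sketch below is an illustration only: it assumes small dense matrices stored as lists of rows, and the helper names (`outer_product_form`, `product_update`, `bfgs_update`, `max_abs_diff`) are hypothetical. It applies the Step 3 update to $C_j$ and, separately, the rank two formula of Theorem 1 to $H_j = C_jC_j'$.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, x):
    return [dot(row, x) for row in M]

def outer_product_form(C):
    """Return H = C C' for a square matrix C given as a list of rows."""
    return [[dot(ri, rk) for rk in C] for ri in C]

def product_update(C, g, g_next, sigma):
    """Step 3 of the algorithm: every column c_i becomes c_i + omega_i * s."""
    cols = [list(c) for c in zip(*C)]
    alpha = [dot(c, g) for c in cols]
    beta = [dot(c, g_next) for c in cols]
    s = matvec(C, alpha)                          # s = C C' g
    gamma, delta = dot(s, g), dot(s, g_next)
    root = math.sqrt((1.0 - delta / gamma) * sigma)
    omega = [((root - 1.0) * a + bt) / (gamma - delta)
             for a, bt in zip(alpha, beta)]
    n = len(C)
    return [[C[r][i] + s[r] * omega[i] for i in range(n)] for r in range(n)]

def bfgs_update(H, g, g_next, sigma):
    """The matrix H_{j+1} as written in Theorem 1 (H symmetric)."""
    s = matvec(H, g)
    norm_s = math.sqrt(dot(s, s))
    p = [si / norm_s for si in s]
    d = [(a - bn) / (sigma * norm_s) for a, bn in zip(g, g_next)]
    Hd = matvec(H, d)                             # equals (d'H)' by symmetry
    dp = dot(d, p)
    coef = (dp + dot(d, Hd)) / dp ** 2
    n = len(H)
    return [[H[r][t] + coef * p[r] * p[t] - (p[r] * Hd[t] + Hd[r] * p[t]) / dp
             for t in range(n)] for r in range(n)]

def max_abs_diff(M, N):
    return max(abs(a - b) for rm, rn in zip(M, N) for a, b in zip(rm, rn))
```

A typical use is to pick a nonsingular `C`, a gradient `g`, a step size `sigma` and a gradient `g_next` with $s_j'(g_j - g_{j+1}) > 0$, and to compare `outer_product_form(product_update(C, g, g_next, sigma))` with `bfgs_update(outer_product_form(C), g, g_next, sigma)` using `max_abs_diff`; the two should agree up to rounding error.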

4. Convergence.

If the algorithm terminates with some $x_j$, then $\nabla F(x_j) = 0$ and the
assumptions on $F(x)$ imply that $x_j$ is the unique global minimizer of $F(x)$.
For the remainder of the paper we shall assume that the algorithm generates
an infinite sequence $\{x_j\}$. In order to prove that it converges to the
global minimizer of $F(x)$ we shall use the equivalence with the BFGS method
established in Theorem 1.


First we shall write $H_j$ as a sum of $n$ matrices of rank one. For
this purpose let
$$ \rho_j = \|H_jg_j\|^{-1}, \qquad H_j\rho_jg_j = p_j $$
and
$$ \lambda_j = \|H_jg_{j+1}\|^{-1}, \qquad H_j\lambda_jg_{j+1} = q_j . $$
Then $\|p_j\| = \|q_j\| = 1$ and $\rho_jg_j$ and $\lambda_jg_{j+1}$ are
conjugate with respect to $H_j$ since
$$ \rho_j\,g_j'H_jg_{j+1} = p_j'g_{j+1} = 0 . $$

Assuming that $H_j$ is positive definite we can find vectors
$d_{3j}, \ldots, d_{nj}$ such that
$$ H_jd_{ij} = p_{ij}, \qquad \|p_{ij}\| = 1, \qquad i = 3, \ldots, n $$
and
$$ \rho_jg_j,\ \lambda_jg_{j+1},\ d_{3j}, \ldots, d_{nj} $$
are conjugate with respect to $H_j$. Then
$$ H_j = \frac{p_jp_j'}{\rho_j\,g_j'p_j} + \frac{q_jq_j'}{\lambda_j\,g_{j+1}'q_j} + \sum_{i=3}^{n}\frac{p_{ij}p_{ij}'}{d_{ij}'p_{ij}} \tag{4.1} $$
and
$$ H_j^{-1} = \frac{\rho_j\,g_jg_j'}{g_j'p_j} + \frac{\lambda_j\,g_{j+1}g_{j+1}'}{g_{j+1}'q_j} + \sum_{i=3}^{n}\frac{d_{ij}d_{ij}'}{d_{ij}'p_{ij}} . \tag{4.2} $$

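Using (4.1) we can easily derive a corresponding expression for $H_{j+1}$.
First we observe that
$$ H_jd_j = \frac{H_jg_j - H_jg_{j+1}}{\sigma_j\|H_jg_j\|} = \frac{1}{\sigma_j}\Bigl(p_j - \frac{\|H_jg_{j+1}\|}{\|H_jg_j\|}\,q_j\Bigr) $$
implies that
$$ H_{j+1}x = H_jx \quad \text{for } x \in T_j = \{x \mid p_j'x = q_j'x = 0\} . $$
Since $d_{3j}, \ldots, d_{nj} \in T_j$ we have to change only the first two
terms in (4.1) in order to obtain a representation of $H_{j+1}$, i.e., we have
to determine two vectors in the span of $g_j$ and $g_{j+1}$ which are
conjugate with respect to $H_{j+1}$.

We have
$$ d_j \in \mathrm{span}\{g_j, g_{j+1}\}, \qquad H_{j+1}d_j = p_j, \qquad \text{and} \qquad d_j'H_{j+1}d_j = d_j'p_j > 0 . \tag{4.3} $$
Furthermore,
$$ g_{j+1}'H_{j+1}d_j = g_{j+1}'p_j = 0 . \tag{4.4} $$
Thus $g_{j+1}$ and $d_j$ are conjugate with respect to $H_{j+1}$. By the
definition of $H_{j+1}$,
$$ H_{j+1}g_{j+1} = H_jg_{j+1} - \frac{d_j'H_jg_{j+1}}{d_j'p_j}\,p_j = (q_j - \alpha_jp_j)\,\|H_jg_{j+1}\|, \qquad \alpha_j = \frac{d_j'q_j}{d_j'p_j}, \tag{4.5} $$
and
$$ g_{j+1}'H_{j+1}g_{j+1} = g_{j+1}'H_jg_{j+1} > 0 . \tag{4.6} $$
With
$$ \rho_{j+1} = \|H_{j+1}g_{j+1}\|^{-1}, \qquad H_{j+1}\rho_{j+1}g_{j+1} = p_{j+1} $$
it follows from (4.1)-(4.6) that $H_{j+1}$ and $H_{j+1}^{-1}$ are positive
definite and can be written in the form
$$ H_{j+1} = \frac{p_{j+1}p_{j+1}'}{\rho_{j+1}\,g_{j+1}'p_{j+1}} + \frac{p_jp_j'}{d_j'p_j} + \sum_{i=3}^{n}\frac{p_{ij}p_{ij}'}{d_{ij}'p_{ij}} \tag{4.7} $$
and
$$ H_{j+1}^{-1} = \frac{\rho_{j+1}\,g_{j+1}g_{j+1}'}{g_{j+1}'p_{j+1}} + \frac{d_jd_j'}{d_j'p_j} + \sum_{i=3}^{n}\frac{d_{ij}d_{ij}'}{d_{ij}'p_{ij}} . \tag{4.8} $$

By the definition of $H_{j+2}$,
$$ H_{j+2}x = H_{j+1}x \quad \text{for } x \in T_{j+1} = \{x \mid x'H_{j+1}g_{j+1} = x'H_{j+1}g_{j+2} = 0\} . $$
Setting
$$ q_{j+1} = \frac{H_{j+1}g_{j+2}}{\|H_{j+1}g_{j+2}\|}, \qquad \lambda_{j+1} = \frac{1}{\|H_{j+1}g_{j+2}\|} $$
and observing that
$$ \rho_{j+1}\,g_{j+1}'H_{j+1}g_{j+2} = p_{j+1}'g_{j+2} = 0 $$
we can write $H_{j+1}$ and $H_{j+1}^{-1}$ as follows:
$$ H_{j+1} = \frac{p_{j+1}p_{j+1}'}{\rho_{j+1}\,g_{j+1}'p_{j+1}} + \frac{q_{j+1}q_{j+1}'}{\lambda_{j+1}\,g_{j+2}'q_{j+1}} + \sum_{i=3}^{n}\frac{p_{i,j+1}p_{i,j+1}'}{d_{i,j+1}'p_{i,j+1}} \tag{4.9} $$
and
$$ H_{j+1}^{-1} = \frac{\rho_{j+1}\,g_{j+1}g_{j+1}'}{g_{j+1}'p_{j+1}} + \frac{\lambda_{j+1}\,g_{j+2}g_{j+2}'}{g_{j+2}'q_{j+1}} + \sum_{i=3}^{n}\frac{d_{i,j+1}d_{i,j+1}'}{d_{i,j+1}'p_{i,j+1}} , \tag{4.10} $$
where $d_{3,j+1}, \ldots, d_{n,j+1}$ are vectors in $T_{j+1}$ with
$$ d_{i,j+1}'H_{j+1}d_{k,j+1} = 0, \quad i \ne k, \qquad \|H_{j+1}d_{i,j+1}\| = 1, \qquad H_{j+1}d_{i,j+1} = p_{i,j+1}, \quad i = 3, \ldots, n . $$
This representation is completely analogous to (4.1) and (4.2) and can be
used to derive a representation of $H_{j+2}$ as a sum of $n$ matrices of rank
one.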

In the following convergence proof we shall use a simple argument involving
the trace of the matrices $H_j$ and $H_{j+1}$. By definition, the trace of a
square matrix $M$, denoted by $\mathrm{tr}(M)$, is the sum of the diagonal
elements of $M$.
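
The way the trace acts on the rank one terms used in this section can be made explicit. As a brief worked note (the vectors $u$, $v$ and the scalar $c$ are placeholders, not notation from the paper): for $u, v \in E^n$ and $c \ne 0$,
$$ \mathrm{tr}(uv') = \sum_{i=1}^{n} u_iv_i = v'u, \qquad \mathrm{tr}\Bigl(\frac{uu'}{c}\Bigr) = \frac{\|u\|^2}{c} . $$
Because the trace is linear, the trace of a sum of rank one matrices is the sum of such quotients; for a unit vector $u$ the contribution reduces to $1/c$.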

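From (4.1) and (4.2) we have
$$ \begin{aligned}
\mathrm{tr}(H_j) &= \mathrm{tr}\Bigl(\frac{p_jp_j'}{\rho_j\,g_j'p_j}\Bigr) + \mathrm{tr}\Bigl(\frac{q_jq_j'}{\lambda_j\,g_{j+1}'q_j}\Bigr) + \sum_{i=3}^{n}\mathrm{tr}\Bigl(\frac{p_{ij}p_{ij}'}{d_{ij}'p_{ij}}\Bigr) \\
&= \frac{1}{\rho_j\,g_j'p_j} + \frac{1}{\lambda_j\,g_{j+1}'q_j} + \sum_{i=3}^{n}\frac{1}{d_{ij}'p_{ij}}
\end{aligned} \tag{4.11} $$
and
$$ \mathrm{tr}(H_j^{-1}) = \frac{\rho_j\|g_j\|^2}{g_j'p_j} + \frac{\lambda_j\|g_{j+1}\|^2}{g_{j+1}'q_j} + \sum_{i=3}^{n}\frac{\|d_{ij}\|^2}{d_{ij}'p_{ij}} . \tag{4.12} $$

Setting
$$ \tau_j = \frac{1 + \lambda_j^2\|g_{j+1}\|^2}{\lambda_j\,g_{j+1}'q_j}, \qquad \omega_j = \frac{1 + \|d_j\|^2}{d_j'p_j}, \tag{4.13} $$
$$ \varphi_j = \tau_j + \sum_{i=3}^{n}\frac{1 + \|d_{ij}\|^2}{d_{ij}'p_{ij}} \tag{4.14} $$
and
$$ \varphi_{j+1} = \frac{1 + \lambda_{j+1}^2\|g_{j+2}\|^2}{\lambda_{j+1}\,g_{j+2}'q_{j+1}} + \sum_{i=3}^{n}\frac{1 + \|d_{i,j+1}\|^2}{d_{i,j+1}'p_{i,j+1}} \tag{4.15} $$
we conclude from (4.7)-(4.12) that
$$ \begin{aligned}
\varphi_{j+1} &= \mathrm{tr}(H_{j+1}) - \frac{1}{\rho_{j+1}\,g_{j+1}'p_{j+1}} + \mathrm{tr}(H_{j+1}^{-1}) - \frac{\rho_{j+1}\|g_{j+1}\|^2}{g_{j+1}'p_{j+1}} \\
&= \omega_j + \sum_{i=3}^{n}\frac{1 + \|d_{ij}\|^2}{d_{ij}'p_{ij}} \\
&= \varphi_j - \tau_j + \omega_j .
\end{aligned} \tag{4.16} $$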
) J J 


The equality (4.16) will be used to prove that the sequence of gradients
produced by the algorithm converges to zero. First we observe that
$\varphi_j > 0$ since $H_j$ is positive definite and, therefore, all terms on
the right hand side of (4.15) are positive. Secondly we shall show (Lemma 1)
that, because of the assumptions on $F(x)$, the sequence $\{\omega_j\}$ is
bounded. This implies that there is a constant $\gamma$ and an infinite set
$J \subset \{0, 1, 2, \ldots\}$ such that
$$ \tau_j \le \gamma \quad \text{for } j \in J . \tag{4.17} $$
Using the definition of $\tau_j$ and $p_{j+1}$ it can easily be shown
(Lemma 2) that there is $\varepsilon > 0$ such that
$$ g_{j+1}'p_{j+1} \ge \varepsilon\,\|g_{j+1}\| \quad \text{for } j \in J . $$
By a routine argument it follows from this that
$$ \|g_{j+1}\| \to 0 \quad \text{as } j \to \infty,\ j \in J, $$
which by the uniform convexity of $F(x)$ implies
$$ \|g_j\| \to 0 \quad \text{as } j \to \infty . $$

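Lemma 1.

$$ \omega_j = \frac{1 + \|d_j\|^2}{d_j'p_j} \le \frac{1 + \eta^2}{\mu} \quad \text{for all } j . $$

Proof:

By Taylor's theorem there are
$$ \xi_j, \eta_j \in \{x \mid x = x_j - t(\sigma_js_j),\ 0 \le t \le 1\} $$
such that
$$ p_j'g_{j+1} = p_j'g_j - \sigma_j\,p_j'G(\xi_j)s_j, \qquad d_j'g_{j+1} = d_j'g_j - \sigma_j\,d_j'G(\eta_j)s_j . $$
Therefore,
$$ d_j'p_j = \frac{(g_j - g_{j+1})'p_j}{\sigma_j\|s_j\|} = p_j'G(\xi_j)p_j \ge \mu \tag{4.18} $$
and
$$ \|d_j\|^2 = \frac{d_j'(g_j - g_{j+1})}{\sigma_j\|s_j\|} = d_j'G(\eta_j)p_j \le \|d_j\|\,\|G(\eta_j)\| \le \eta\,\|d_j\| . \tag{4.19} $$
Hence $\|d_j\| \le \eta$, which together with (4.18) gives the asserted bound.

Lemma 2.

For every $\gamma > 0$ there is $\varepsilon > 0$ such that, for all $j$,
$$ \tau_j = \frac{1 + \lambda_j^2\|g_{j+1}\|^2}{\lambda_j\,g_{j+1}'q_j} \le \gamma \quad \text{implies} \quad g_{j+1}'p_{j+1} \ge \varepsilon\,\|g_{j+1}\| . $$

Proof:

If $\tau_j \le \gamma$, then
$$ \frac{g_{j+1}'q_j}{\|g_{j+1}\|} \ge \frac{1 + \lambda_j^2\|g_{j+1}\|^2}{\gamma\,\lambda_j\|g_{j+1}\|} \ge \frac{1}{\gamma} . $$
By (4.5),
$$ p_{j+1} = \frac{q_j - \alpha_jp_j}{\|q_j - \alpha_jp_j\|} . $$
Since, by (4.18) and (4.19),
$$ \|q_j - \alpha_jp_j\| \le 1 + |\alpha_j|, \qquad |\alpha_j| = \frac{|d_j'q_j|}{d_j'p_j} \le \frac{\|d_j\|}{d_j'p_j} \le \frac{\eta}{\mu}, $$
and $g_{j+1}'p_j = 0$, it follows that
$$ \frac{g_{j+1}'p_{j+1}}{\|g_{j+1}\|} \ge \frac{1}{1 + |\alpha_j|}\,\frac{g_{j+1}'q_j}{\|g_{j+1}\|} \ge \frac{1}{\gamma\,(1 + \eta/\mu)} . $$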


Theorem 2.

The sequence $\{x_j\}$ generated by the algorithm converges to a $z$
such that
$$ F(z) < F(x) \quad \text{for all } x \ne z $$
and
$$ \nabla F(z) = 0 . $$

Proof:

By (4.17) and Lemma 1 and 2 there is $\varepsilon > 0$ and an infinite set
$J \subset \{0, 1, 2, \ldots\}$ such that
$$ g_j'p_j \ge \varepsilon\,\|g_j\| \quad \text{for } j \in J . $$
Furthermore, it follows from Taylor's theorem that, for some
$\xi_j \in \{x \mid x = x_j - t(\sigma s_j),\ 0 \le t \le 1\}$,
$$ \begin{aligned}
F(x_j - \sigma s_j) - F(x_j) &= -\sigma\,g_j's_j + \tfrac12\sigma^2\,s_j'G(\xi_j)s_j \\
&\le -\sigma\,g_j's_j + \tfrac12\sigma^2\eta\,\|s_j\|^2 .
\end{aligned} $$
Thus choosing
$$ \sigma = \frac{g_j's_j}{\eta\,\|s_j\|^2} $$
we obtain, for $j \in J$,
$$ F(x_{j+1}) - F(x_j) \le F(x_j - \sigma s_j) - F(x_j) \le -\frac{(g_j's_j)^2}{2\eta\,\|s_j\|^2} \le -\frac{\varepsilon^2}{2\eta}\,\|g_j\|^2 . $$
Since $F(x)$ is bounded from below and $F(x_{j+1}) \le F(x_j)$ for all $j$, it
follows that
$$ \|g_j\| \to 0 \quad \text{as } j \to \infty,\ j \in J . $$
Let $z$ be a cluster point of the sequence $\{x_j,\ j \in J\}$. Then
$\nabla F(z) = 0$. By the uniform convexity of $F(x)$, $z$ is the unique global
minimizer of $F(x)$. Since $F(x_{j+1}) \le F(x_j)$ for all $j$, the sequence
$\{x_j\}$ has no cluster point except $z$. Thus $x_j \to z$ as $j \to \infty$.

5. Superlinear convergence.

In order to prove that
$$ \frac{\|x_{j+1} - z\|}{\|x_j - z\|} \to 0 \quad \text{as } j \to \infty $$
we shall again use the trace of the matrices $H_j$ and $H_{j+1}$. The argument
will be based on a modification of the formula (4.16).

By Theorem 2, $x_j \to z$ as $j \to \infty$. Let $G = G(z)$ and denote by
$G^{1/2}$ the square root of $G$. By definition $G^{1/2}$ is a symmetric
positive definite matrix with the property
$$ G = G^{1/2}G^{1/2} . $$
We set $G^{-1/2} = (G^{1/2})^{-1}$. Replacing the matrices $H_j$ and $H_j^{-1}$ by
$$ G^{1/2}H_jG^{1/2} \quad \text{and} \quad (G^{1/2}H_jG^{1/2})^{-1} = G^{-1/2}H_j^{-1}G^{-1/2}, $$
respectively, in the formulae (4.1), (4.2) and (4.7)-(4.16) we obtain
$$ y'x \ge 0.5\mu, \qquad \|Gx - y\|^2 \le \eta(\eta + \mu)(\tau - 2) $$
and $\|Gx - y\| \le \nu$ implies
$$ \tau - 2 \le \frac{2}{\mu^2}\,\|Gx - y\|^2 . $$

Proof:

Set $v = y - Gx$. Then
$$ \tau = \frac{x'Gx + y'G^{-1}y}{y'x} = \frac{2x'Gx + 2v'x + v'G^{-1}v}{x'Gx + v'x} = 2 + \frac{v'G^{-1}v}{x'Gx + v'x} . $$
Thus $\tau \ge 2$ and
$$ \frac{1}{\eta}\|v\|^2 \le v'G^{-1}v = (\tau - 2)(x'Gx + v'x) \le (\tau - 2)(\|G\| + \|v\|) \le (\tau - 2)(\eta + \|v\|) . $$
Therefore, we have
$$ \|v\| \le (\tau - 2)\,\eta\Bigl[\frac{\eta}{\|v\|} + 1\Bigr], $$
which implies that for $\tau - 2$ sufficiently small $\|v\| \le 0.5\mu$. Thus,
for $\tau - 2$ sufficiently small,
$$ \|v\|^2 \le \eta(\eta + \mu)(\tau - 2) $$
and
$$ y'x = x'Gx + v'x \ge \mu - \|v\| \ge 0.5\mu . $$
Finally, for $\|Gx - y\| \le 0.5\mu$,
$$ \tau - 2 \le \frac{v'G^{-1}v}{\mu - \|v\|} \le \frac{2}{\mu^2}\,\|v\|^2 . $$
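
The identity at the start of this proof, $\tau = 2 + v'G^{-1}v/(y'x)$ with $\tau = (x'Gx + y'G^{-1}y)/(y'x)$ and $v = y - Gx$, is easy to check numerically. The sketch below is an illustration only; it assumes a $2 \times 2$ symmetric positive definite $G$ so that the inverse can be written out explicitly, and the function name `tau_and_excess` is hypothetical.

```python
def tau_and_excess(G, x, y):
    """Return (tau, v'G^{-1}v / y'x) for a 2x2 symmetric positive definite G.

    tau = (x'Gx + y'G^{-1}y) / (y'x) and v = y - Gx, as in the proof above.
    """
    (a, b), (c, d) = G
    det = a * d - b * c
    G_inv = [[d / det, -b / det], [-c / det, a / det]]

    def dot(u, w):
        return u[0] * w[0] + u[1] * w[1]

    def matvec(M, u):
        return [dot(M[0], u), dot(M[1], u)]

    Gx = matvec(G, x)
    v = [y[0] - Gx[0], y[1] - Gx[1]]
    yx = dot(y, x)
    tau = (dot(x, Gx) + dot(y, matvec(G_inv, y))) / yx
    excess = dot(v, matvec(G_inv, v)) / yx
    return tau, excess
```

For any `x`, `y` with $y'x > 0$ the two returned values should satisfy `tau - 2 == excess` up to rounding, and `tau` is close to 2 exactly when `y` is close to `Gx`.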


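Lemma 4.

There is a constant $\nu > 0$ such that, for all $j$, $\tau_j - 2 \le \nu$
implies
$$ |\psi_{j+1} - \tau_j| = O(\max\{\sqrt{\tau_j - 2},\ \|g_j\|,\ \|g_{j+1}\|\}) . $$

Proof:

By (4.5),
$$ p_{j+1} = (q_j - \alpha_jp_j)\,\|q_j - \alpha_jp_j\|^{-1} . $$
Therefore, $\rho_{j+1} = \lambda_j\,\|q_j - \alpha_jp_j\|^{-1}$ implies
$$ \rho_{j+1}\,g_{j+1}'p_{j+1} = \lambda_j\,g_{j+1}'q_j\,\|q_j - \alpha_jp_j\|^{-2} . $$
Furthermore,
$$ p_{j+1}'Gp_{j+1} = (q_j - \alpha_jp_j)'G(q_j - \alpha_jp_j)\,\|q_j - \alpha_jp_j\|^{-2} $$
and
$$ \begin{aligned}
\psi_{j+1} = \frac{p_{j+1}'Gp_{j+1} + \rho_{j+1}^2\,g_{j+1}'G^{-1}g_{j+1}}{\rho_{j+1}\,g_{j+1}'p_{j+1}}
&= \frac{(q_j - \alpha_jp_j)'G(q_j - \alpha_jp_j) + \lambda_j^2\,g_{j+1}'G^{-1}g_{j+1}}{\lambda_j\,g_{j+1}'q_j} \\
&= \tau_j + \frac{\alpha_j^2\,p_j'Gp_j - 2\alpha_j\,p_j'Gq_j}{\lambda_j\,g_{j+1}'q_j} .
\end{aligned} $$
By definition, $\alpha_j = d_j'q_j / d_j'p_j$ and
$$ \begin{aligned}
d_j'q_j &= p_j'Gq_j + (d_j - Gp_j)'q_j \\
&= p_j'(\lambda_jg_{j+1}) - p_j'(\lambda_jg_{j+1} - Gq_j) + (d_j - Gp_j)'q_j .
\end{aligned} $$
Therefore,
$$ |d_j'q_j| \le \|\lambda_jg_{j+1} - Gq_j\| + \|d_j - Gp_j\| . $$
Since $\|d_j - Gp_j\| = O(\max\{\|g_j\|, \|g_{j+1}\|\})$, $d_j'p_j \ge \mu$ and
Lemma 3 implies that for $\nu$ sufficiently small
$$ \|\lambda_jg_{j+1} - Gq_j\| = O(\sqrt{\tau_j - 2}) $$
and
$$ \lambda_j\,g_{j+1}'q_j \ge 0.5\mu, $$
we have
$$ |\psi_{j+1} - \tau_j| \le \frac{\alpha_j^2\,\|G\| + 2|\alpha_j|\,\|G\|}{0.5\mu} = O(\max\{\sqrt{\tau_j - 2},\ \|g_j\|,\ \|g_{j+1}\|\}) . $$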


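Lemma 5.

There is a constant $\nu > 0$ such that, for all $j$, $\psi_j - 2 \le \nu$
implies
$$ |\sigma_j - 1| = O(\max\{\sqrt{\psi_j - 2},\ \|g_j\|,\ \|g_{j+1}\|\}) $$
and
$$ \frac{\|g_{j+1}\|}{\|g_j\|} = O(\max\{\sqrt{\psi_j - 2},\ \|g_j\|,\ \|g_{j+1}\|\}) . $$

Proof:

By Taylor's theorem there is
$$ \xi_j \in \{x \mid x = x_j - t(\sigma_js_j),\ 0 \le t \le 1\} $$
such that
$$ 0 = s_j'g_{j+1} = s_j'g_j - \sigma_j\,s_j'G(\xi_j)s_j . $$
Thus
$$ \sigma_j = \frac{s_j'g_j}{s_j'G(\xi_j)s_j} = 1 + \frac{g_j's_j - s_j'G(\xi_j)s_j}{s_j'G(\xi_j)s_j} . \tag{5.2} $$
Since with $v_j = \rho_jg_j - Gp_j$ we have
$$ g_j's_j = \Bigl(\frac{Gp_j}{\rho_j} + \frac{v_j}{\rho_j}\Bigr)'s_j = s_j'Gs_j + v_j's_j\,\|s_j\| $$
it follows that
$$ \begin{aligned}
|\sigma_j - 1| &= \frac{|s_j'Gs_j + v_j's_j\,\|s_j\| - s_j'G(\xi_j)s_j|}{s_j'G(\xi_j)s_j}
= \frac{|s_j'(G - G(\xi_j))s_j + v_j's_j\,\|s_j\||}{s_j'G(\xi_j)s_j} \\
&\le \frac{1}{\mu}\bigl(\|G - G(\xi_j)\| + \|v_j\|\bigr) \\
&= O(\max\{\sqrt{\psi_j - 2},\ \|g_j\|,\ \|g_{j+1}\|\}),
\end{aligned} $$
where the last equality follows from Lemma 3 and (2.4) since
$$ \|G - G(\xi_j)\| \le \|G - G_j\| + \|G_j - G(\xi_j)\| \le L\|x_j - z\| + L\|x_{j+1} - x_j\| . $$

Setting
$$ E_j = \int_0^1 G(x_j - t\sigma_js_j)\,dt - G $$
we have again by Taylor's theorem
$$ \begin{aligned}
g_{j+1} &= g_j - \sigma_jGs_j - \sigma_jE_js_j \\
&= g_j - Gs_j + (1 - \sigma_j)Gs_j - \sigma_jE_js_j
\end{aligned} $$
and
$$ \frac{\|g_{j+1}\|}{\|g_j\|} \le \frac{\|\rho_jg_j - Gp_j\|}{\|\rho_jg_j\|} + |1 - \sigma_j|\,\frac{\|Gp_j\|}{\|\rho_jg_j\|} + \sigma_j\,\frac{\|E_j\|}{\|\rho_jg_j\|} . $$
Since $\|Gp_j\| \ge \mu$ we obtain from Lemma 3 that for $\nu$ sufficiently small
$$ \|\rho_jg_j - Gp_j\| = O(\sqrt{\psi_j - 2}) \quad \text{and} \quad \|\rho_jg_j\| \ge 0.5\mu . $$
Furthermore, by (5.2)
$$ \sigma_j \le \frac{\|g_j\|\,\|s_j\|}{\mu\,\|s_j\|^2} = \frac{\|\rho_jg_j\|}{\mu} $$
and by (2.4),
$$ \begin{aligned}
\|E_j\| &\le \Bigl\|\int_0^1 G(x_j - t\sigma_js_j)\,dt - G_j\Bigr\| + \|G_j - G\| \\
&\le \max_{0 \le t \le 1}\|G(x_j - t\sigma_js_j) - G_j\| + \|G_j - G\| \\
&\le L\|x_{j+1} - x_j\| + L\|x_j - z\| = O(\max\{\|g_j\|, \|g_{j+1}\|\}) .
\end{aligned} $$
Therefore we have
$$ \frac{\|g_{j+1}\|}{\|g_j\|} = O(\max\{\sqrt{\psi_j - 2},\ \|g_j\|,\ \|g_{j+1}\|\}) . $$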

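Theorem 3.

Let the sequences $\{\sigma_j\}$, $\{g_j\}$ and $\{x_j\}$ be generated by the
algorithm. Then

i) $\sigma_j \to 1$ as $j \to \infty$,

ii) $\dfrac{\|g_{j+1}\|}{\|g_j\|} \to 0$ as $j \to \infty$,

iii) $\dfrac{\|x_{j+1} - z\|}{\|x_j - z\|} \to 0$ as $j \to \infty$.

Proof:

By Theorem 2, $\|g_j\| \to 0$ as $j \to \infty$. If $\tau_j \to 2$ as
$j \to \infty$, it follows from Lemma 4 that $\psi_j \to 2$, which by Lemma 5
and (2.4) implies the statements of the theorem. Thus it suffices to show that
$$ \tau_j \to 2 \quad \text{as } j \to \infty . $$
Since $\tau_j \ge 2$ this is equivalent with proving that for every
$\varepsilon > 0$ there is $j(\varepsilon)$ such that
$$ \tau_j \le 2 + \varepsilon \quad \text{for } j \ge j(\varepsilon) . $$
Since $\|d_j - Gp_j\| \to 0$ as $j \to \infty$, it follows from Lemma 3 that
for $j$ sufficiently large,
$$ \omega_j = 2 + O(\|d_j - Gp_j\|) = 2 + O(\max\{\|g_j\|, \|g_{j+1}\|\}) . \tag{5.3} $$
Therefore, by (5.1)
$$ \varphi_{j+1} = \varphi_j - \tau_j + 2 + O(\max\{\|g_j\|, \|g_{j+1}\|\}) . $$
Let
$$ J = \{j \mid \tau_j > 2 + \varepsilon\} . $$
Since $\|g_j\| \to 0$ as $j \to \infty$, there is $j_1$ such that
$$ \varphi_{j+1} \le \varphi_j - (2 + \varepsilon) + 2 + O(\max\{\|g_j\|, \|g_{j+1}\|\}) \le \varphi_j - \frac{\varepsilon}{2} \quad \text{for } j \ge j_1 \text{ and } j \in J . \tag{5.4} $$
If $j \notin J$ and $\varepsilon$ is sufficiently small it follows from Lemma 4
and 5 that there is $j_2$ such that
$$ \|g_{j+2}\| \le 0.5\,\|g_{j+1}\| \quad \text{for } j \ge j_2 \text{ and } j \notin J . \tag{5.5} $$
Thus, if $j - 1 \in J$, $j + i \notin J$, $i = 0, \ldots, k-1$, $j + k \in J$
it follows from (5.3) and (5.5) that for $j \ge j_2$
$$ \begin{aligned}
\varphi_{j+k} - \varphi_j &= (\varphi_{j+1} - \varphi_j) + \ldots + (\varphi_{j+k} - \varphi_{j+k-1}) \\
&= O(\|g_j\|) + O(\|g_{j+1}\|) + \ldots + O(\|g_{j+k-1}\|) \\
&= O(\|g_j\|) + O(2\|g_{j+1}\|) \\
&\le \frac{\varepsilon}{4} \quad \text{for } j \text{ sufficiently large} .
\end{aligned} \tag{5.6} $$
Since $\varphi_j > 0$, (5.4) and (5.6) imply that $J$ has at most finitely many
elements.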

As a further application of (5.1) we obtain

Theorem 4.

The sequences
$$ \{\|H_j\|\} \quad \text{and} \quad \{\|H_j^{-1}\|\} $$
are bounded.

Proof:

From (5.1) we have
$$ \begin{aligned}
\varphi_j &= \varphi_0 + \sum_{i=1}^{j}(\varphi_i - \varphi_{i-1}) \\
&= \varphi_0 + \sum_{i=0}^{j-1}(\omega_i - \tau_i) \\
&\le \varphi_0 + \sum_{i=0}^{j-1}(\omega_i - 2)
\end{aligned} $$
because $2 - \tau_i \le 0$ for all $i$. Since $\|d_j - Gp_j\| \to 0$ as
$j \to \infty$ it follows from Lemma 3 that
$$ \sum_{i=1}^{\infty}(\omega_i - 2) = O\Bigl(\sum_{i=1}^{\infty}\max\{\|g_i\|^2, \|g_{i+1}\|^2\}\Bigr) = O\Bigl(\sum_{i=1}^{\infty}\|g_i\|^2\Bigr) < \infty . $$
Thus the sequence $\{\varphi_j\}$ is bounded. Moreover, in the proof of the
previous theorem it has been shown that $\tau_j \to 2$ as $j \to \infty$. Thus
Lemma 4 implies that $\psi_j \to 2$ as $j \to \infty$. Therefore
$$ 0 < \mathrm{tr}(G^{1/2}H_jG^{1/2}) + \mathrm{tr}(G^{-1/2}H_j^{-1}G^{-1/2}) = \varphi_j + \psi_j $$
is bounded. Since the trace of a matrix is equal to the sum of its eigenvalues,
this means that there is a uniform upper bound for the eigenvalues of the
matrices $G^{1/2}H_jG^{1/2}$ and $G^{-1/2}H_j^{-1}G^{-1/2}$, $j = 0, 1, 2, \ldots$.
Thus there is a uniform upper bound for the numbers $\|H_j\|$ and
$\|H_j^{-1}\|$, $j = 0, 1, 2, \ldots$.